Wide-angle dewarping method and apparatus

ABSTRACT

A method and apparatus for transforming wide angle video into perspective corrected viewing zones which either a single user or multiple users may select, orient and magnify. The present invention first captures a wide angle digital video input by any suitable means. The captured image is then stored in a suitable memory means so portions of the image may be selected at a later time. When a portion of the stored video is selected for viewing, a plurality of discrete viewing vectors in three dimensional space are chosen on the video input and transformed to a plurality of control points in a two dimensional plane or any other suitable surface. The area between these points which is still warped from the original wide angle image capture is then transformed to a perspective corrected field of view. The perspective corrected field of view is then displayed on a suitable displaying apparatus, such as a monitor or head mounted display.

This is a reissue of U.S. Pat. No. 7,042,497, which is a continuation of application Ser. No. 09/429,697, filed Oct. 28, 1999, now U.S. Pat. No. 6,346,967, which is a continuation of prior application Ser. No. 09/128,963, filed Aug. 4, 1998, U.S. Pat. No. 6,005,611, and which was a continuation of application Ser. No. 08/250,594, filed May 27, 1994, U.S. Pat. No. 5,796,426, incorporated herein by reference.

FIELD OF THE INVENTION

The present invention relates to a method and apparatus for displaying a perspective corrected field of view from wide angle video sources, and more particularly relates to permitting the user of an orientation sensing means to view a selected portion of stored or real time video encoded from a wide angle source and transforming that portion to a perspective-corrected field of view.

BACKGROUND OF THE INVENTION

“Virtual reality” and “telepresence” have become extremely popular for use in research, industrial and entertainment applications. In “virtual reality”, or VR, a user is permitted to view a computer-generated graphical representation of a selected environment. Depending on the sophistication of the hardware and software used to generate the virtual reality environment, the user may be treated to a three dimensional view of the simulated environment. In “telepresence,” a user is permitted to view a real-world, live or recorded environment from a three dimensional perspective.

In addition, in some higher end systems the user is permitted to see different portions of the VR and telepresence environments simply by moving or orienting his head in one or more degrees of freedom. This permits the user to obtain the sensation that he is immersed in the computer-generated/real-world environment. High end devices detect pan, roll and tilt motions by the user and cause the environment to change accordingly. The pan/tilt/roll motions may be inputted by many types of input devices, such as joysticks, buttons or head orientation sensors (which may be connected to head mounted displays).

In VR applications, a continuing problem is how to render a three dimensional environment of the quality and speed users want while offering the product at a price they can afford. To make a realistic environment, such as in a three dimensional video game, many three dimensional polygons need to be rendered. This rendering requires prohibitively expensive hardware, which greatly restricts the commercial value of such a system.

In relation to telepresence applications, a continuing problem with the prior art is how to encode sufficient data that a viewer may arbitrarily move his viewing perspective within the telepresence environment and not look beyond the field of view. One relatively simple solution, where the telepresence environment is based on a real three dimensional environment, is to simply use the head orientation sensors to cause a camera to track the orientation of the viewer. This has obvious limitations in that only one viewer can be in the telepresence environment at a time (since the camera can only track one viewer, and the other viewers will not typically be able to follow the head motions of the controlling viewer) and, also, prerecorded data cannot be used. Further, there is an inherent delay between a change in user viewing perspective and the time that it takes to realign the corresponding camera. These limitations greatly restrict the value of such systems.

One method for overcoming each of these limitations is to encode, either in real time or by pre-recording, a field of view largely equivalent to the entire range of motion vision of a viewer—that is, what the viewer would see if he moved his head in each permitted direction throughout the entire permissible range. For example, encoding substantially a full hemisphere of visual information would permit a plurality of viewers a reasonable degree of freedom to interactively look in a range of directions within the telepresence environment.

The difficulty with this approach is that most means for encoding such information distort, or warp, the visual data, so that the information must be corrected, or “de-warped,” before a viewer can readily assimilate it. For example, a typical approach for encoding substantially a full hemisphere of information involves using a fish-eye lens. Fish-eye lenses, by their nature, convert a three dimensional scene to a two-dimensional representation by compressing the data at the periphery of the field of view. For the information to be viewed comfortably by a viewer in the VR environment, the visual data must be decompressed, or dewarped, so that it is presented in normal perspective as a two dimensional representation.

One solution to the distortion problem is proposed in U.S. Pat. No. 5,185,667 issued to Steven Zimmerman. The '667 patent describes an apparatus which effects camera control for pan, tilt, rotate and zoom while having no moving parts. Through the use of a fisheye lens and a complicated trigonometric technique, portions of the video images can be dewarped. However, the solution proposed by the '667 patent is impractical because it is insufficiently flexible to accommodate the use of lenses other than a theoretically perfect hemispherical fisheye lens without introducing mathematical errors due to the misfit between the theoretical and the actual lens characteristics. This solution also introduces undesirable trigonometric complexity which slows down the transformation and is overly expensive to implement. This solution further maps each individual pixel through the complex trigonometric mapping formula, further reducing the speed of the transformation from one coordinate system to another.

As a result, there has been a substantial need for a method and apparatus which can dewarp encoded wide angle visual data with sufficient speed and accuracy to permit a viewer to immerse himself in a VR or telepresence environment and look around within the environment, while at the same time permitting other viewers to concurrently and independently engage in the same activity on the same broadcast video signal. There has also been a need for a method and apparatus capable of providing such dewarping on a general purpose high speed computer.

SUMMARY OF THE INVENTION

The present invention overcomes the limitations of the prior art. In particular, the present invention transforms a plurality of viewing vectors within a selected portion of the wide angle, three dimensional video input into two dimensional control points and uses a comparatively simple method to transform the image between the control points to create a perspective-corrected field of view.

More specifically, the present invention is drawn to a method and apparatus which provides perspective corrected views of live, prerecorded or simulated wide angle environments. The present invention first captures a wide angle digital video input by any suitable means, such as through the combination of a high resolution video camera, hemispherical fisheye lens and real time digital image capture board. The captured image is then stored in a suitable memory means so portions of the image may be selected at a later time.

When a portion of the stored video is selected, a plurality of discrete viewing vectors in three dimensional space are chosen and transformed into a plurality of control points in a corresponding two dimensional plane. The area between the control points, which is still warped from the original wide angle image capture, is then transformed into a perspective corrected field of view through a biquadratic polynomial mapping technique. The perspective corrected field of view is then displayed on a suitable displaying apparatus, such as a monitor or head mounted display. The present invention further has the ability to sense an inputted selection, orientation and magnification of a new portion of the stored video for transformation.

In comparison with the prior art, the present invention provides a dependable, low cost, faster and more elegantly simple solution to dewarping wide angle three dimensional images. The present invention also allows for simultaneous dynamic transformation of wide angle video for multiple viewers and provides each user with the ability to access and manipulate the same or different portions of the video input. In VR applications, the present invention also allows the computer generated three dimensional polygons to be rendered in advance; thus, users may view the environments from any orientation quickly and without expensive rendering hardware.

It is therefore one object of the present invention to provide a method and apparatus for dewarping wide angle video to a perspective corrected field of view which can then be displayed.

It is another object of the present invention to provide a method and apparatus which can simultaneously transform the same or different portions of wide angle video input for different users.

It is yet another object of the present invention to provide a method and apparatus which allows selection and orientation of any portion of the video input.

It is still another object of the present invention to provide a method and apparatus for magnification of the video input.

It is still another object of the present invention to provide a method and apparatus which performs all of the foregoing objects while having no moving parts.

These and other objects of the invention will be better understood from the following Detailed Description of the Invention, taken together with the attached Figures.

THE FIGURES

FIG. 1 shows a functional block diagram of one embodiment of the present invention.

FIG. 2 diagrams the geometry between three dimensional (X-Y-Z) space and its corresponding two dimensional (U-V) plane.

FIG. 3a shows a bilinear mapping of a warped image.

FIG. 3b shows a biquadratic mapping of a warped image.

FIG. 4 shows a side view of a viewing vector from a three dimensional (X-Y-Z) wide angle lens as it is seen on a two dimensional (U-V) plane.

FIG. 5 shows a three dimensional field of view along with a plurality of viewing vectors according to the present invention.

FIG. 6 shows a block diagram of the elements of a forward texture mapping ASIC according to the present invention.

FIG. 7 shows an example of a U-V source texture transformed into an X-Y plane destination texture according to the present invention.

FIG. 8 shows one embodiment of how to obtain a 360 degree view using six hemispherical fisheye lenses according to the present invention.

FIG. 9 is a functional flow chart of one embodiment of the present invention.

DETAILED DESCRIPTION OF THE INVENTION

Referring now to FIG. 1, an embodiment of the present invention for processing wide angle video information in real time is shown. A high resolution video camera 10 having a wide angle lens 20, such as a hemispherical fisheye lens, is directed to a real world scene 22. The output 24 of the camera 10 is provided to a real time image digitizing board 30, commonly referred to as a “frame grabber,” located in or operatively connected to a conventional high speed computer indicated generally at 150. The camera 10 may be any camera which is capable of using a wide angle lens and providing suitable resolution. In most instances the camera will be a video camera, although in some instances it may be desirable to use a still frame camera. One suitable fisheye lens is the Nikkor Auto 1:1.8 f=8 mm fisheye lens, which can be adapted to a standard high resolution broadcast video camera with a Nikon FW-Eng TMW-B1 converter. The computer 150 is any computer capable of receiving and processing video information at an acceptable rate and may, for example, be an 80486-based or Pentium™-based system, or another computer platform such as those made by Silicon Graphics, Sun Micro Systems, Apple Computer, or similar other computer manufacturers.

The fisheye lens 20 causes the video output signal 24 from the camera 10 to be optically warped in a non-linear manner. Before the image can be comfortably viewed by a user, perspective-correcting measures must be taken. The digitized video signal 24 is thus transferred through the digitizing board 30 (typically, but not necessarily, operating at 30 frames per second) into memory 40 of the computer 150 so that portions of the video picture can be randomly accessed by a microprocessor 50, also within the computer 150, at any time.

The dewarping software is also stored in memory 40 and is applied to the video signal 24 by the microprocessor 50. The stored video signal is then transmitted from memory 40 to a special purpose ASIC 60 capable of biquadratic or higher order polynomial transformations for texture warping and interpolation. Alternatively, the texture warping ASIC 60 may be omitted and its functionality may be performed by software. Phantom lines have been used to show the optional nature of ASIC 60. The perspective corrected video signal is next transmitted to a video output stage 70, such as a standard VGA card, and from there displayed on a suitable monitor, head mounted display or the like 80. An input device 90, such as a joystick or headtracker (which senses the head movements of a user wearing a head mounted display), transmits position information through a suitable input port 100, such as a standard serial, parallel or game port, to the microprocessor 50 to control the portion of the stored video that is selected, dewarped and displayed. The input device 90 also transmits roll/pitch/yaw information to the microprocessor 50 so that a user may control the orientation of the dewarped video signal. Further, one skilled in the art will appreciate that a magnification option could be added to the input device 90 to allow the user to magnify the selected portion of video input, constrained only by the resolution of the camera 10.

FIG. 2 shows a real world three dimensional environment 200 which has been imaged by the wide angle lens 20. This environment is defined by the Cartesian coordinate system in X, Y and Z, with the viewpoint defined to be the origin of the coordinate system. The viewing direction of the user, as defined by the input device 90, is given as a viewing vector in the X-Y-Z coordinate system. The image plane 210 containing the warped wide angle image is defined by a two dimensional coordinate system in U and V, with the origin of the coordinate system coincident with the origin of the X-Y-Z coordinate system. If the field of view of the lens 20 is sufficient, and the lens is rotationally symmetric about the viewing axis, the digitized warped image will be roughly circular in the U-V plane.

The first generation of ASICs, developed for low-cost texture mapping of three dimensional graphics, mapped video images through a bilinear technique, such as is shown in FIG. 3(a). These chips were able to apply linear interpolation to texture pixels in both the X and Y directions and could thereby stretch rectangular source textures to any two dimensional quadrilateral shape. An example of a chip of this type is the Artist Graphics 3GA chip. These bilinear chips do, however, introduce texture errors for polygons whose vertices have been subject to significant amounts of perspective, and further are not capable of sufficiently high order texture distortion to adequately flatten extreme wide angle views, such as those produced by hemispherical fisheye lenses.

FIG. 3b shows an example of a biquadratic technique, such as is now coming onto the market. The preferred embodiment of the present invention uses an ASIC chip which implements a texture warping technique of at least second polynomial order. The present invention is of sufficient simplicity that this technique could also be implemented in software on a general purpose high speed computer, such as a Silicon Graphics Indigo™ computer or a Pentium™-based computer.

The warped image in the U-V plane, shown in FIG. 2, has a radius 220 equal to RADIUS pixels, with an origin at UORIGIN and VORIGIN. For any rotationally symmetric lens, the warping effect of the lens can be described by a single lens equation, r = f(θ), where the function f(θ) maps any incident ray at angle θ from the axis of viewing to a radial displacement in pixels, r, from the center of the U-V plane, as shown in FIG. 4.

For any given viewing direction in three dimensional X-Y-Z space, we then have:

$$
\begin{aligned}
s &= \sqrt{x^{2} + y^{2}} \\
\theta &= \arctan\frac{s}{z} \\
r &= f(\theta) \\
u &= \frac{rx}{s} \\
v &= \frac{ry}{s}
\end{aligned} \tag{1}
$$

In the case of an ideal hemispheric fisheye lens, f(θ) = (RADIUS)(sin θ), and the lens equation which results is:

$$
r = \frac{(\mathrm{RADIUS})\,s}{z\sqrt{1 + \dfrac{s^{2}}{z^{2}}}}
  = \frac{(\mathrm{RADIUS})\sqrt{x^{2} + y^{2}}}{z\sqrt{1 + \dfrac{x^{2} + y^{2}}{z^{2}}}} \tag{2}
$$

Equations (1) convert an inputted X-Y-Z three dimensional viewing vector into a corresponding control point in the U-V plane.
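By way of illustration, the projection of equations (1) can be sketched in a few lines of Python. This is a minimal sketch assuming the ideal hemispherical lens model f(θ) = RADIUS·sin(θ) given above; the function name project_to_uv, the numpy dependency, and the optional origin offsets are illustrative assumptions, not part of the patent.

```python
import numpy as np

# Sketch of equations (1): project a 3D viewing vector (x, y, z) to a
# control point (u, v) in the warped image plane, assuming the ideal
# hemispherical fisheye lens equation f(theta) = RADIUS * sin(theta).
# u_origin/v_origin correspond to UORIGIN/VORIGIN from FIG. 2.

def project_to_uv(x, y, z, radius, u_origin=0.0, v_origin=0.0):
    s = np.hypot(x, y)
    if s == 0.0:                       # vector along the +Z viewing axis
        return u_origin, v_origin      # maps to the image center
    theta = np.arctan2(s, z)           # angle from the axis of viewing
    r = radius * np.sin(theta)         # lens equation r = f(theta)
    u = r * x / s
    v = r * y / s
    return u_origin + u, v_origin + v
```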

To dewarp a rectangular portion of the wide angle video input for a given viewing direction (x, y, z), eight other viewing vectors, which surround the primary viewing vector, are computed at the field of view angles fov_h and fov_v from the primary viewing vector, as shown in FIG. 5. Each of these nine vectors is then projected from three dimensional X-Y-Z space to the two dimensional U-V plane by equations (1). The result is a 3×3 grid of control points in the U-V plane, with the edges of the grid curving mathematically to conform with the curvature induced by the warping effect of the wide angle lens.
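One way to generate the nine viewing vectors is sketched below. The patent does not prescribe a particular basis construction, so the world-up choice and the tangent-plane offsets here are one plausible reading; viewing_vector_grid is an illustrative name.

```python
import numpy as np

# Hedged sketch: build the 3x3 grid of viewing vectors around a primary
# viewing direction, offset by half the horizontal/vertical fields of
# view (fov_h, fov_v, in radians). The center vector grid[4] is the
# primary viewing vector itself.

def viewing_vector_grid(forward, fov_h, fov_v):
    forward = np.asarray(forward, dtype=float)
    forward = forward / np.linalg.norm(forward)
    # Pick any axis not parallel to 'forward' to complete a basis
    # (an assumption; any non-degenerate choice would do).
    world_up = np.array([0.0, 1.0, 0.0])
    if abs(forward @ world_up) > 0.99:
        world_up = np.array([1.0, 0.0, 0.0])
    right = np.cross(world_up, forward)
    right = right / np.linalg.norm(right)
    up = np.cross(forward, right)
    grid = []
    for dv in (-1, 0, 1):              # rows: bottom, center, top
        for dh in (-1, 0, 1):          # cols: left, center, right
            vec = (forward
                   + np.tan(dh * fov_h / 2.0) * right
                   + np.tan(dv * fov_v / 2.0) * up)
            grid.append(vec / np.linalg.norm(vec))
    return grid                        # nine unit viewing vectors
```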

The global bivariate polynomial transformation

$$
u = \sum_{i=0}^{N} \sum_{j=0}^{N-i} a_{ij} x^{i} y^{j}, \qquad
v = \sum_{i=0}^{N} \sum_{j=0}^{N-i} b_{ij} x^{i} y^{j} \tag{3}
$$

with x and y now referring to the pixel coordinates in the output device, is then found to describe the geometric correction necessary to transform the region within the warped 3×3 grid in the U-V plane into a perspective corrected field of view. A biquadratic polynomial transformation, N = 2 in the above equations, has been selected because a second order polynomial approximates the warping characteristics of most lenses to an adequately high degree of precision, and because there is existing hardware to perform the resulting biquadratic transformation. However, it will be appreciated by one skilled in the art that other polynomial transformations of higher degree could be used to increase the precision of the transformation.

Expanding the above equations (3):

$$
\begin{bmatrix}
u_{1} \\ u_{2} \\ u_{3} \\ u_{4} \\ u_{5} \\ u_{6} \\ u_{7} \\ u_{8} \\ u_{9}
\end{bmatrix}
=
\begin{bmatrix}
1 & x_{1} & y_{1} & x_{1}y_{1} & x_{1}^{2} & y_{1}^{2} \\
1 & x_{2} & y_{2} & x_{2}y_{2} & x_{2}^{2} & y_{2}^{2} \\
1 & x_{3} & y_{3} & x_{3}y_{3} & x_{3}^{2} & y_{3}^{2} \\
1 & x_{4} & y_{4} & x_{4}y_{4} & x_{4}^{2} & y_{4}^{2} \\
1 & x_{5} & y_{5} & x_{5}y_{5} & x_{5}^{2} & y_{5}^{2} \\
1 & x_{6} & y_{6} & x_{6}y_{6} & x_{6}^{2} & y_{6}^{2} \\
1 & x_{7} & y_{7} & x_{7}y_{7} & x_{7}^{2} & y_{7}^{2} \\
1 & x_{8} & y_{8} & x_{8}y_{8} & x_{8}^{2} & y_{8}^{2} \\
1 & x_{9} & y_{9} & x_{9}y_{9} & x_{9}^{2} & y_{9}^{2}
\end{bmatrix}
\begin{bmatrix}
a_{00} \\ a_{10} \\ a_{01} \\ a_{11} \\ a_{20} \\ a_{02}
\end{bmatrix} \tag{4}
$$

The values for v and b_(ij) can be similarly found. In matrix form, the expanded equations (4) can be written as:

$$
U = WA, \qquad V = WB \tag{5}
$$

To discover a_(ij) and b_(ij) according to the method of the present invention, a pseudo-inverse technique is used. However, one skilled in the art will appreciate that there are methods to solve equations (5) other than by a pseudo-inverse technique, e.g. a least squares technique. The pseudo-inverse solutions for A and B in the above equations (5) are:

$$
A = (W^{T}W)^{-1}W^{T}U, \qquad B = (W^{T}W)^{-1}W^{T}V \tag{6}
$$

Therefore, for a target display of a given pixel resolution N×M, W and its pseudo-inverse (W^{T}W)^{-1}W^{T} can be calculated a priori. The values for a_(ij) and b_(ij) are then found by mapping the points in the U-V plane for the 3×3 grid of control points using the above equations (6). The biquadratic polynomial transformations of the equations (3) are then used to transform the area between the control points. In this embodiment, the determination of the coordinates of each pixel in the U-V plane takes a total of thirteen multiplication and ten addition operations. Additionally, three of the required multiplication operations per pixel may be obviated by storing a table of xy, x² and y² values for each xy coordinate pair in the dewarped destination image. In another embodiment, the “x” values which do not vary as “y” changes (i.e. a₁·x + a₄·x² and b₁·x + b₄·x²) may also be precomputed and stored. Likewise, the “y” values which do not vary as “x” changes may be precomputed and stored. These further optimizations reduce the operations needed to determine the coordinates of each pixel in the U-V plane to two multiplication and four addition operations.
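The pseudo-inverse fit of equations (4) through (6) and the per-pixel inverse mapping of equations (3) can be sketched as follows. This is a hedged Python sketch: the names fit_biquadratic and dewarp are illustrative, nearest-neighbor sampling stands in for the interpolation hardware described later, the control points are assumed to already include the UORIGIN/VORIGIN offsets, and the incremental two-multiply optimization described above is omitted for clarity.

```python
import numpy as np

def fit_biquadratic(xy_points, uv_points):
    # xy_points: nine (x, y) output-pixel positions; uv_points: their
    # nine warped (u, v) control points. Builds W of eq. (4) and solves
    # eq. (6): A = (W^T W)^-1 W^T U, B = (W^T W)^-1 W^T V.
    x, y = xy_points[:, 0], xy_points[:, 1]
    W = np.column_stack([np.ones_like(x), x, y, x * y, x**2, y**2])
    W_pinv = np.linalg.inv(W.T @ W) @ W.T
    A = W_pinv @ uv_points[:, 0]       # a_ij coefficients
    B = W_pinv @ uv_points[:, 1]       # b_ij coefficients
    return A, B

def dewarp(src, A, B, out_w, out_h):
    # Inverse mapping of eq. (3) with N = 2: for each output pixel,
    # evaluate (u, v) in the warped source and sample it.
    out = np.zeros((out_h, out_w) + src.shape[2:], dtype=src.dtype)
    for yi in range(out_h):
        for xi in range(out_w):
            basis = np.array([1.0, xi, yi, xi * yi, xi * xi, yi * yi])
            u = int(round(A @ basis))
            v = int(round(B @ basis))
            if 0 <= v < src.shape[0] and 0 <= u < src.shape[1]:
                out[yi, xi] = src[v, u]  # nearest-neighbor sample
    return out
```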

It will be appreciated by one skilled in the art that the accuracy of the dewarping transformation will increase as the number of transformed viewing vectors increases; i.e., a 4×4 grid of control points will produce a more accurate transformation than a 3×3 grid of control points. However, the gain in accuracy quickly approaches an asymptote as the number of control points is increased. One skilled in the art will recognize, therefore, that there is little reason to increase the number of viewing vectors beyond half of the total number of pixels in the displayed region.

It will be further appreciated by one skilled in the art that the selection of a rectangular shape of the video input could be changed to another shape and still be within the scope of the invention. Further, the number of control points could be increased or decreased to correspondingly increase or decrease the accuracy of the transformation. Further still, an image filtering stage could be applied during the inverse texture mapping without deviating from the present invention.

FIG. 9 shows a functional flow chart of the major elements of one embodiment of the present invention. First, the fixed warped image parameters (step 400) are defined, such as the size of the input image, the input image radius, and the input image center in U-V coordinates, typically measured in pixels. The next step 410 is to initialize the variable dewarped image parameters, such as the size of the dewarped image area, the horizontal and vertical fields of view (generally given in degrees), the creation of an untransformed view cone centered in this embodiment on the +Z axis, and the initialization of the layout and number of control points used therewith. Typically, the next step is to load the precomputed inner-loop matrix values as well as the “xy” product terms, as shown in step 420, to ensure that the transformation is accomplished as quickly and efficiently as possible. In step 430, the video signal is input to the system in any suitable form, i.e. live or pre-recorded real-time digitized video or computer synthesized video environments. The system then allows the user to select the viewing vector (step 440), which in turn determines the portion of video which is to be transformed. The control points are next transformed from the selected viewing vectors (step 450), and the region defined by the control points is dewarped (step 460). The signal is then sent to the video buffer and to an appropriate viewing apparatus (step 470). There is a recursive loop from step 470 to step 430 to allow the video signal to be refreshed, as is needed with live motion video. The loop also allows the user to make on-the-fly selections of alternate portions of the incoming video.
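Tying the steps of FIG. 9 together, the processing loop might look like the sketch below. Here grab_frame, read_viewing_vector and display are stubs standing in for the video input, user selection and output stages (steps 430, 440 and 470), and the helper functions are the sketches given earlier; none of these names come from the patent.

```python
import numpy as np

def run_viewer(radius, out_w, out_h, fov_h, fov_v):
    # Steps 400-420 analogue: fixed 3x3 grid of output-pixel positions,
    # computed once before the loop.
    xy_points = np.array([(xi, yi)
                          for yi in (0, out_h // 2, out_h - 1)
                          for xi in (0, out_w // 2, out_w - 1)],
                         dtype=float)
    while True:                                        # loop 470 -> 430
        frame = grab_frame()                           # step 430 (stub)
        forward = read_viewing_vector()                # step 440 (stub)
        vectors = viewing_vector_grid(forward, fov_h, fov_v)
        uv_points = np.array([project_to_uv(vx, vy, vz, radius,
                                            frame.shape[1] / 2,
                                            frame.shape[0] / 2)
                              for vx, vy, vz in vectors])   # step 450
        A, B = fit_biquadratic(xy_points, uv_points)
        display(dewarp(frame, A, B, out_w, out_h))     # steps 460-470
```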

The foregoing description describes an inverse texture mapping technique whereby the biquadratic output (X-Y) is mapped to the input (U-V). In the case where a forward texture mapping ASIC is used, the coordinates for the destination control points in X-Y must be supplied so that the rectilinear source texture region can be mapped from the U-V plane, as provided by the inverse texture mapping software solution above, to the X-Y plane. An example of a forward texture mapping ASIC is the NV-1 chip sold by N-Vidia Corporation. FIG. 6 gives another example of an ASIC chip 230 which accepts control pixel coordinates at a forward mapping solution stage 240. Four pixel coordinates will be accepted for a bilinear mapping, nine in the case of quadratic mapping, 16 in the case of cubic mapping, etc. These control pixels are produced in the host CPU according to the equations (7), below. As shown in FIG. 7, the coordinates of a rectangular bounding box are determined to enclose an exemplary 3×3 grid of control points u_(i)v_(i) in the U-V plane. The corners of the bounding box are found from the u_(i)v_(i) extrema. Using the same technique described in equations (3), a_(ij) and b_(ij) for N = 2 can be solved with the equations (7) for enclosing the region to be warped to the corners of the display screen.

$$
x = \sum_{i=0}^{N} \sum_{j=0}^{N-i} a_{ij} u^{i} v^{j}, \qquad
y = \sum_{i=0}^{N} \sum_{j=0}^{N-i} b_{ij} u^{i} v^{j} \tag{7}
$$

Thus, the same control points for the U-V plane map to the corners of the display screen in the X-Y plane. The warped regions outside the bounding box may be clipped by hardware or software so that they are not visible on the display screen.
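One plausible reading of equations (7) is that the same pseudo-inverse machinery can be reused with the roles of the two planes swapped, as sketched below. fit_forward_mapping is an illustrative name, fit_biquadratic is the helper sketched earlier, and the choice of a regular 3×3 destination grid over the screen is an assumption for illustration.

```python
import numpy as np

def fit_forward_mapping(uv_points, screen_w, screen_h):
    # Destination grid: send the nine warped control points to a regular
    # 3x3 grid over the display screen (corners, edge midpoints, center).
    xy_targets = np.array([(xi, yi)
                           for yi in (0, screen_h // 2, screen_h - 1)
                           for xi in (0, screen_w // 2, screen_w - 1)],
                          dtype=float)
    # Same pseudo-inverse fit as equations (4)-(6), but the polynomial
    # basis is now built from (u, v), giving eq. (7):
    # x = sum a_ij u^i v^j, y = sum b_ij u^i v^j.
    A, B = fit_biquadratic(uv_points, xy_targets)
    return A, B
```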

The source pixel coordinates, which are fed from the host CPU, are converted to a_(ij) and b_(ij) coordinates for forward mapping in the forward mapping solution stage 240, again using techniques mathematically equivalent to those of the equations (7). A series of instructions is further sent from the host CPU to the chip 230 and received by a control unit 260. The control unit 260 sequences and controls the operation of the other functional stages within the chip 230. The host CPU also directs a linear sequence of source pixels, which are to be warped, to an interpolation sampler stage 250 within the chip 230. Optionally, these can be subject to a low-pass spatial prefiltering stage 270 prior to transmission to the chip, to reduce sampling error during the warping process. Thus, within the chip 230, the source pixels and the a_(ij) and b_(ij) coordinates are both fed to the interpolation sampler 250. For each input pixel, one or more destination pixels, together with their corresponding X-Y destination coordinates, are produced. These warped pixels are then fed into the video frame buffer 280, located outside of the ASIC chip 230. Optionally, anti-aliasing circuitry 290 within the chip performs interpolation on output pixel values, such as bilinear interpolation between adjacent pixel samples, to minimize the effects of output spatial quantization error. One skilled in the art will recognize that the preceding hardware solution is merely exemplary and that there are many such solutions which could be employed and still fall within the scope of the present invention.

The techniques described herein may also be applied to synthetic images. Such images may be created entirely within a computer environment and may be composed of three dimensional geometrical descriptions of objects which can be produced by computer graphics rendering techniques generally known to those skilled in the art. Typically, synthetic images are produced by linear perspective projection, emulating the physical process of imaging onto planar film with a lens having a narrow field of view and producing a view of the synthetic environment as seen through a cone or truncated three dimensional pyramid. The color, intensity shading and other simulated physical properties of each pixel on the planar image grid can also be readily determined. For a synthetic environment, the viewing vectors in X-Y-Z space are rewritten in terms of the warped control point coordinates in the U-V plane:

$$
\begin{aligned}
r &= \sqrt{u^{2} + v^{2}} \\
s &= \frac{r}{\mathrm{RADIUS}} \\
x &= \frac{us}{r} \\
y &= \frac{vs}{r} \\
z &= \sqrt{1 - s^{2}}
\end{aligned} \tag{8}
$$

A direction vector in X-Y-Z space can thus be generated for each pixel in the U-V plane in the synthetic wide angle image which is created. For a perfect hemispherical fisheye, the generated vectors point in all directions within the created hemisphere, spaced to the limits of the resolution of the U-V image. This simulates a non-planar image grid, such as the projection of the synthetic environment onto a surface of a spherical image substrate. In this way, a fisheye or other wide angle image of a synthetic three dimensional environment can be produced. This technique can be used for the production of three dimensional modeled cartoons or interactive home gaming applications, among others.
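Equations (8) might be coded as in the sketch below, assuming (u, v) are measured from the image center and the ideal hemispherical lens model used throughout; uv_to_direction is an illustrative name.

```python
import numpy as np

# Sketch of equations (8): recover a unit direction vector in X-Y-Z
# space for a pixel (u, v) of a synthetic fisheye image.

def uv_to_direction(u, v, radius):
    r = np.hypot(u, v)
    if r == 0.0:
        return np.array([0.0, 0.0, 1.0])  # image center looks along +Z
    s = r / radius                         # s = sin(theta) for this lens
    if s > 1.0:
        return None                        # outside the fisheye image circle
    x = u * s / r
    y = v * s / r
    z = np.sqrt(1.0 - s * s)
    return np.array([x, y, z])
```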

One skilled in the art will appreciate that the present invention may be applied to a sequence of wide angle images changing in time, either live or recorded to an analog or digital storage medium. The image substrate for recordation may be an electronic two dimensional image sensor, such as a CCD chip, or photographic film capable of chemically recording the image for subsequent transfer into digital form.

One skilled in the art will also appreciate that the present invention is not limited to transforming wide angle video onto a planar (U-V) surface, but that it is within the scope of the invention to transform wide angle video onto any suitable surface for displaying the video for the user.

Further, two real world, wide angle lenses can be positioned opposite each other to permit near 360 degrees of total coverage of the environment. If seamless omnidirectional coverage of an environment is required, this could be achieved with six wide angle lenses positioned along each direction of a three dimensional axis, as shown in FIG. 8. This arrangement can be coupled with a video switching mechanism for choosing which signal is to be dewarped for the selected view and orientation of the video input.

Further still, the same video signal may be simultaneously transmitted to an arbitrarily large number of viewers, all having the ability to simultaneously dewarp the same or different portions of the video input, as in the case of interactive cable TV viewing or multiple player online interactive video game playing.

Having fully described the preferred embodiment of the present invention, it will be apparent to those of ordinary skill in the art that numerous alternatives and equivalents exist which do not depart from the invention set forth above. It is therefore to be understood that the present invention is not to be limited by the foregoing description, but only by the appended claims.

1. A method for providing perspective corrected images from at least one distorted image, the method comprising steps of: receiving said distorted image; storing a portion of said distorted image; transforming a set of control vectors to a set of control points that defines an area that associates said portion of said distorted image with a portion of a perspective corrected image; transforming said portion of said distorted image associated with said area to said portion of said perspective corrected image using a global bivariate polynomial transformation; displaying said portion of said perspective corrected image; sensing inputted information; and controlling the transformation and display of said perspective corrected image through said inputted information.

2. The method of claim 1 wherein the step of transforming said portion of said distorted image is accomplished using:

$$
u = \sum_{i=0}^{N} \sum_{j=0}^{N-i} a_{ij} x^{i} y^{j}, \qquad
v = \sum_{i=0}^{N} \sum_{j=0}^{N-i} b_{ij} x^{i} y^{j}
$$

as said global bivariate polynomial transformation.

3. The method of claim 2 wherein N is 2 or 3.

4. The method of claim 1 wherein said set of control points contains a number of control points, said number having a lower limit of five and an upper limit of one-half of the number of pixels in said portion of said perspective corrected image.
5. The method of claim 1 wherein said distorted image is received from a computer storage device.

6. The method of claim 1 wherein said distorted image is received from a network.

7. The method of claim 6 wherein said network is a computer network.

8. The method of claim 6 wherein said network is in communication with the Internet.

9. The method of claim 1 wherein said distorted image is a result of imaging an environment through at least one wide-angle lens.

10. The method of claim 1 wherein said distorted image is a result of imaging an environment through at least one fisheye lens.

11. The method of claim 1 wherein said distorted image includes an image of at least one sixth of an environment.

12. An apparatus for providing perspective corrected images from at least one distorted image, the apparatus comprising: an input configured to receive said distorted image; a memory, coupled to the input, configured to store a portion of said distorted image; a processor, coupled to the memory, configured to transform a set of control vectors to a set of control points that defines an area that associates said portion of said distorted image with a portion of a perspective corrected image, the processor further configured to transform said portion of said distorted image associated with said area to said portion of said perspective corrected image using a global bivariate polynomial transformation; a presentation mechanism, coupled to the memory, configured to present said portion of said perspective corrected image; and a selection mechanism, coupled to the processor, configured to specify said set of control vectors.
13. The apparatus of claim 12 wherein the transformation of said portion of said distorted image is accomplished using:

$$
u = \sum_{i=0}^{N} \sum_{j=0}^{N-i} a_{ij} x^{i} y^{j}, \qquad
v = \sum_{i=0}^{N} \sum_{j=0}^{N-i} b_{ij} x^{i} y^{j}
$$

as said global bivariate polynomial transformation.

14. The method of claim 13 wherein N is 2 or 3.

15. The apparatus of claim 12 wherein said set of control points contains a number of control points, said number having a lower limit of five and an upper limit of one-half of the number of pixels in said portion of said perspective corrected image.

16. The apparatus of claim 12 wherein said distorted image is received from a computer storage device.

17. The apparatus of claim 12 wherein said distorted image is received from a network.

18. The apparatus of claim 17 wherein said network is a computer network.

19. The apparatus of claim 17 wherein said network is in communication with the Internet.

20. The apparatus of claim 12 wherein said distorted image is a result of imaging an environment through at least one wide-angle lens.

21. The apparatus of claim 12 wherein said distorted image is a result of imaging an environment through at least one fisheye lens.

22. The apparatus of claim 12 wherein said plurality of distorted images includes an image of at least one sixth of an environment.
23. A method of providing perspective corrected views of live, prerecorded or simulated environments from wide angle video signals, the method comprising: receiving video input data at a computing device; transforming, by the computing device, a plurality of viewing vectors defining a portion of the video input data to a plurality of control points; transforming, by the computing device, pixel data in an area between the plurality of control points to define perspective corrected pixel data; and sending the perspective corrected pixel data to a display for display of the perspective corrected pixel data.

24. The method of claim 23 further comprising determining the plurality of viewing vectors at the computing device based on position and orientation information of a device and a horizontal field of view and a vertical field of view.

25. The method of claim 23 further comprising receiving a selection of the plurality of viewing vectors at the computing device.

26. The method of claim 23 wherein the plurality of viewing vectors define an area of interest.

27. The method of claim 23 wherein a global bivariate polynomial transformation is used to define the perspective corrected pixel data.

28. The method of claim 27 wherein the global bivariate polynomial transformation is a biquadratic polynomial transformation.

29. A system comprising: a computer-readable memory configured to store video input data; and a processor operably coupled to the computer-readable memory to receive the stored video input data, the processor configured to perform operations comprising transforming a plurality of viewing vectors defining a portion of the stored video input data to a plurality of control points; transforming pixel data in an area between the plurality of control points to define perspective corrected pixel data; and sending the perspective corrected pixel data to a display.

30. The system of claim 29, wherein the processor is an ASIC.

31. The system of claim 29, wherein the computer-readable memory has computer-executable instructions stored thereon, execution of which by the processor causes the processor to perform the operations.

32. The system of claim 29, wherein the operations further comprise determining the plurality of viewing vectors based on position and orientation information of a device, a horizontal field of view, and a vertical field of view.

33. The system of claim 29, wherein the operations further comprise receiving a selection of the plurality of viewing vectors.

34. The system of claim 29, wherein the plurality of viewing vectors define an area of interest.

35. The system of claim 29, wherein a global bivariate polynomial transformation is used to define the perspective corrected pixel data.

36. The system of claim 35, wherein the global bivariate polynomial transformation is a biquadratic polynomial transformation.
37. A method of performing perspective correction, the method comprising: transforming, by a computing device, a set of control vectors into a set of control points using a function that models a wide angle lens; generating, by the computing device, a polynomial transform function that maps the control points into rectangular points; transforming, by the computing device, an area of image data proximate the set of control points using the polynomial transform function; and sending the transformed area of image data to a display for display of the transformed area of image data.

38. The method of claim 37, wherein the polynomial transform function comprises a global bivariate polynomial.

39. The method of claim 37, wherein the polynomial transform function comprises a biquadratic polynomial.

40. The method of claim 37, wherein the set of control vectors comprises a principal viewing vector and eight surrounding vectors that form a rectangular view.

41. The method of claim 37, wherein the wide-angle lens comprises a fish eye lens.

42. A system comprising: a computer-readable memory configured to store captured image data; and a processor operably coupled to the computer-readable memory to receive the stored captured image data, the processor configured to perform operations comprising transforming a set of control vectors into a set of control points using a function that models a wide angle lens; generating a polynomial transform function that maps the control points into rectangular points; transforming an area of the captured image data proximate the set of control points using the polynomial transform function; and sending the transformed area of image data to a display.